Production of soluble recombinant protein without N-terminal methionine

ABSTRACT

The present invention is directed to cells, compositions and methods for the production of recombinant protein, wherein an f-met group at the N-terminus is enzymatically removed. In particular, the invention is directed to a production process for obtaining high levels of soluble recombinant CRM₁₉₇ protein from E. coli. Cells preferably contain one or more mutations of disulfide reductase genes, so that disulfide reductase activity is reduced. The invention also relates to a purification method for CRM₁₉₇ as well as characterization of properly folded CRM₁₉₇ protein.

REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 16/154,020 filed Oct. 8, 2018, which issued as U.S. Pat. No. 10,597,664 on Mar. 24, 2020, and a continuation-in-part of U.S. application Ser. No. 15/114,642 filed Jul. 27, 2016, which issued as U.S. Pat. No. 10,093,704 on Oct. 9, 2018, which is a National Stage Application, under 35 U.S.C. § 371, of International Application No. PCT/US2015/14130 filed Feb. 2, 2015, which claims priority to U.S. Provisional Application No. 61/934,377 filed Jan. 31, 2014, the entirety of each of which is specifically incorporated by reference.

BACKGROUND

1. Field of the Invention

The invention is directed to methods and compositions for the expression and purification of products such as peptides and proteins in microorganisms. In particular, pre-products are expressed recombinantly, wherein the cytoplasm of the microorganism alters the expressed pre-products to produce products in a final or usable form. Alterations include shifting of the redox state of the cytoplasm and site-directed cleavage and/or ligation.

2. Description of the Background

Diphtheria toxin (DT) is a proteinaceous exotoxin synthesized and secreted by pathogenic strains of Corynebacterium diphtheriae. These pathogenic strains contain a bacteriophage lysogen that carries the toxin gene. Diphtheria toxin is an ADP-ribosylating enzyme that is secreted as a proenzyme of 535 residues and processed by trypsin-like proteases with release of two fragments (A and B). Fragment A uses NAD as a substrate, catalyzing the cleavage of the N-glycosidic bond between the nicotinamide ring and the N-ribose and mediating the covalent transfer of the ADP-ribose (ADPRT activity) to the modified histidine 715 (diphthamide) of the elongation factor EF-2. This post-translational diphthamide modification inactivates EF-2, halting protein synthesis and resulting in cell death. The A fragment of DT (also named C domain) carries the catalytic active site and is the only fragment of the toxin required for the final step of intoxication. The R domain, carried on the B fragment, mediates binding to receptors on the host cell surface and the T domain, also carried on the B fragment, promotes the pH-dependent transfer of fragment A to the cytoplasm. An arginine-rich disulfide-linked loop connects fragment A to fragment B (or domain C to domains TR). This inter-chain disulfide bond is the only covalent link between the two fragments after proteolytic cleavage of the chain at position 186. The isolation of various nontoxic and partially toxic immunologically cross-reacting forms of diphtheria toxins (CRMs or cross reacting materials) resulted in discovery of CRM₁₉₇ (Uchida et al., Journal of Biological Chemistry 248, 3845-3850, 1973; see also Giannini et al. Nucleic Acids Res. 1984 May 25; 12(10):4063-9). Preferably, CRMs can be of any size and composition that contain all or a portion of DT.

CRM₁₉₇ is an enzymatically inactive and nontoxic form of diphtheria toxin that contains a single amino acid substitution G52E. This mutation causes intrinsic flexibility of the active-site loop in front of the NAD-binding site, reduces the ability of CRM₁₉₇ to bind NAD, and eliminates the toxic properties of DT (Malito et al., Proc. Natl. Acad. Sci. USA 109(14):5229-34, 2012). Like DT, CRM₁₉₇ has two disulfide bonds. One disulfide joins Cys186 to Cys201, linking fragment A to fragment B. A second disulfide bridge joins Cys461 to Cys471 within fragment B. Both DT and CRM₁₉₇ have fragment A-associated nuclease activity (Bruce et al., Proc. Natl. Acad. Sci. USA 87, 2995-8, 1990).

CRM₁₉₇ is commonly used as the carrier protein for protein-carbohydrate and hapten-protein conjugates. As a carrier protein, CRM₁₉₇ has a number of advantages over diphtheria toxoid as well as other toxoid proteins, many of which have been documented (Shinefield, Vaccine, 28:4335, 2010; Broker et al., Biologicals, 39:195, 2011). For example, since CRM₁₉₇ is genetically detoxified, it retains a larger complement of lysines, which are used for conjugation but are blocked by chemical toxoiding. CRM₁₉₇ has proven to be an effective carrier protein for Streptococcus pneumoniae capsular polysaccharides, as evidenced by the success of PREVNAR™ (Pfizer), a vaccine consisting of up to 13 capsular polysaccharides chemically linked to CRM₁₉₇. There is also evidence suggesting that, compared with tetanus toxoid, there is less carrier-induced suppression of the immune response, especially when there are many individual polysaccharides linked to the same carrier protein.

CRM₁₉₇ and native DT have a similar affinity for the diphtheria toxin receptor (DTR), which has an identical amino acid sequence to the HB-EGF precursor pro-HB-EGF (Mitamura et al., J. Biol. Chem. 272(43):27084-90, 1997). CRM₁₉₇ binds to the soluble form of HB-EGF, as well as to the membrane form pro-HB-EGF, and inhibits HB-EGF mitotic action by preventing its binding to EGF receptor. Thus CRM₁₉₇ may also have a future role in cancer therapy (Miyamoto et al., Anticancer Res. November-December 27(6A):3713-21, 2007).

Although CRM₁₉₇ has been produced in the original host Corynebacterium, yields are low, typically <50 mg/L and, in addition, Corynebacterium growth is relatively slow as compared with, for example, E. coli. Corynebacterium strains have been engineered to produce CRM₁₉₇ at higher levels (e.g., see U.S. Pat. No. 5,614,382). CRM₁₉₇ has also been expressed at high levels in a strain of Pseudomonas fluorescens. However, production of CRM₁₉₇ in other strains would be advantageous, such as strains at a BL1 level and strains that are inexpensive to culture and propagate. Production of CRM₁₉₇ in E. coli has mainly resulted in inclusion bodies (generally insoluble), which then require a difficult refolding process, resulting in low yields.

E. coli is a widely used host to produce recombinant proteins for research and therapeutic purposes. Recombinant proteins can be expressed in the E. coli cytoplasm or periplasm. The periplasm has an oxidative environment; therefore, many recombinant proteins containing disulfide bonds are produced in the periplasm to ensure proper folding and solubility. The signal peptide that directs recombinant protein into the periplasm is clipped off during the secretion process, resulting in production of protein with the native amino acid sequence. However, the translocation mechanisms that direct proteins to the periplasm have limited capacity, and so the periplasmic expression level of recombinant proteins is usually low. On the other hand, expression in the E. coli cytoplasm can lead to grams of recombinant protein per liter of cell culture. However, the E. coli cytoplasm has a reducing environment, and recombinant proteins containing disulfide bonds are usually insoluble when expressed intracellularly. Another limitation to cytoplasmic recombinant protein expression in E. coli is that, in order to initiate expression of recombinant protein in E. coli, the coding sequence of the protein should start from an ATG codon, which is translated to formyl-methionine (N-terminal methionine). During intracellular expression of recombinant protein, the N-terminal methionine is excised by endogenous E. coli methionine aminopeptidase (MAP). This process is not fully effective, even if the size of the adjacent residue is optimal for cleavage, because the protein is overexpressed and the amount of MAP is limited. Usually a substantial amount of purified recombinant protein contains methionine as a first amino acid, which is not a part of the mature protein sequence. This is not desirable for most therapeutic proteins. The existing methods to ensure effective cleavage of formyl-methionine include treatment in vitro with recombinant MAP, or adding a MAP coding sequence to the expression vector and co-expressing it with the recombinant protein to increase the amount of MAP. Both approaches are time consuming and costly. Commercially available E. coli strains (e.g., Origami®, Shuffle®) with gor−/trx− mutations, available from Millipore and New England Biolabs, respectively, can produce soluble, intracellular proteins containing disulfide bonds, but the cell strains are crippled and do not grow to a high density, limiting production yield. Thus, while these strains are suitable for generating research material, their low growth levels make them difficult to use commercially. Thus, a need exists for strains that express high levels of properly folded proteins that do not contain an N-terminal methionine.
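The dependence of methionine excision on the adjacent residue can be illustrated with a minimal Python sketch. It assumes the commonly cited rule of thumb that MAP acts mainly when the second residue has a small side chain (Gly, Ala, Ser, Thr, Cys, Pro, Val); that residue set comes from general literature rather than from this document, and the function names and test sequences are hypothetical. The sketch says nothing about the incomplete excision caused by overexpression, which is the problem addressed here.

```python
# Rule-of-thumb predictor for N-terminal methionine excision by MAP.
# ASSUMPTION: the set of "small" second residues below is a general
# literature heuristic, not a value taken from this document.
SMALL_SECOND_RESIDUES = frozenset("GASTCPV")


def likely_met_excised(protein: str) -> bool:
    """Return True if the initiator Met is a plausible MAP substrate."""
    seq = protein.upper()
    if len(seq) < 2 or seq[0] != "M":
        return False
    return seq[1] in SMALL_SECOND_RESIDUES


def mature_sequence(protein: str) -> str:
    """Return the sequence expected after complete excision, if any."""
    seq = protein.upper()
    return seq[1:] if likely_met_excised(seq) else seq


if __name__ == "__main__":
    # Toy sequences for illustration only.
    for toy in ("MGADDV", "MKTAYI"):
        print(toy, likely_met_excised(toy), mature_sequence(toy))
```

Even for a sequence the heuristic marks as cleavable, the passage above explains that a substantial fraction may retain the methionine when MAP is limiting.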

SUMMARY OF THE INVENTION

The present invention overcomes the problems and disadvantages associated with current strategies and designs and provides new compositions and methods for producing recombinant peptides and proteins.

One embodiment of the invention is directed to methods of producing recombinant peptides and proteins in bacteria comprising: expressing the protein from a recombinant cell containing an expression vector that encodes the protein sequence, wherein the recombinant cell has a reduced activity of one or more disulfide reductase enzymes, wherein the one or more disulfide reductase enzymes comprises one or more of an oxidoreductase, a dihydrofolate reductase, a thioredoxin reductase, or a glutathione reductase, wherein the reduced activity of one or more disulfide reductase enzymes results in a shift of the redox status of the cytoplasm to a more oxidative state as compared to a recombinant cell that does not have reduced activity of one or more disulfide reductase enzymes, expressing a peptidase from the recombinant cell, which is either integrated into the genome of the host cell, or expressed recombinantly via an expression vector, that acts on the protein expressed and removes a formyl-methionine group from the N-terminal portion of the protein; and isolating the protein. Peptidases that remove an N-terminal methionine include MAP, METAP1 (Methionine aminopeptidase 1), and METAP2 (Methionine aminopeptidase 2). Preferably the integrated gene or expression vector contains a ribosome binding site, an initiation codon, and an expression enhancer region. Preferably the recombinant cell has a reduced activity of only one disulfide reductase enzyme or a reduced activity of only two disulfide reductase enzymes. Preferably the recombinant cell is an E. coli cell or a derivative or strain of E. coli, and preferably the recombinant protein expressed comprises tetanus toxin, tetanus toxin heavy chain proteins, diphtheria toxoid, tetanus toxoid, Pseudomonas exoprotein A, Pseudomonas aeruginosa toxoid, Bordetella pertussis toxoid, Clostridium perfringens toxoid, Escherichia coli heat-labile toxin B subunit, Neisseria meningitidis outer membrane complex, Haemophilus influenzae protein D, Flagellin Fli C, Horseshoe crab Haemocyanin, and fragments, derivatives, and modifications thereof. Preferably the integrated gene and/or expression vector contains an inducible promoter for the peptidase, and expressing comprises inducing the inducible promoter with a first inducing agent, and the cell contains an expression vector that encodes the recombinant peptide or protein, which may be inducible with a second inducing agent. Preferably the first and second inducing agents are the same, although they may be different. Preferably the first integrated gene or expression vector contains an inducible second promoter and expressing the peptidase comprises inducing the inducible second promoter with the first inducing agent. Preferably isolating comprises chromatography wherein the chromatography comprises a sulfate resin, a gel resin, an active sulfated resin, a phosphate resin, a heparin resin or a heparin-like resin. Preferably the isolated protein expressed is conjugated with polyethylene glycol and/or a derivative of polyethylene glycol or with a polymer such as, for example, a polysaccharide, a peptide, an antibody or portion of an antibody, a lipid, a fatty acid, or a combination thereof.

Another embodiment of the invention comprises methods of producing a peptide comprising: expressing the peptide with a formyl-methionine group at the N-terminus of the peptide from a recombinant cell containing an expression vector that encodes the peptide, wherein the recombinant cell has a reduced activity of one or more disulfide reductase enzymes and the expression vector contains a promoter functionally linked to a coding region of the peptide, wherein the reduced activity of one or more disulfide reductase enzymes results in a shift of the redox status of the cytoplasm to a more oxidative state as compared to a recombinant cell that does not have reduced activity of one or more disulfide reductase enzymes, and expressing a peptidase from an integrated gene of a recombinant cell that acts on the peptide expressed and removes the formyl-methionine group from the N-terminus of the peptide; and isolating the peptide. Preferably the expression vector contains a ribosome binding site, an initiation codon, and an expression enhancer region. Preferably the recombinant cell has a reduced activity of only one disulfide reductase enzyme or only two disulfide reductase enzymes. Preferably the one or more disulfide reductase enzymes comprises one or more of an oxidoreductase, a dihydrofolate reductase, a thioredoxin reductase, or a glutathione reductase. Preferably the recombinant cell is an E. coli cell or a derivative or strain of E. coli and the peptide or protein comprises tetanus toxin, tetanus toxin heavy chain proteins, diphtheria toxoid, tetanus toxoid, Pseudomonas exoprotein A, Pseudomonas aeruginosa toxoid, Bordetella pertussis toxoid, Clostridium perfringens toxoid, Escherichia coli heat-labile toxin B subunit, Neisseria meningitidis outer membrane complex, Haemophilus influenzae protein D, Flagellin Fli C, Horseshoe crab Haemocyanin, and fragments, derivatives, and modifications thereof. Preferably the promoter is an inducible promoter and expressing comprises inducing the inducible promoter with an inducing agent. Preferably isolating comprises chromatography, wherein the chromatography comprises a sulfate resin, a gel resin, an active sulfated resin, a phosphate resin, a heparin resin or a heparin-like resin. Preferably the peptide isolated is conjugated with polyethylene glycol (PEG) and/or a derivative of PEG, or coupled to a polymer such as, for example, a polysaccharide, a peptide, an antibody or portion of an antibody, a lipid, a fatty acid, or a combination thereof.

Other embodiments and advantages of the invention are set forth in part in the description, which follows, and in part, may be obvious from this description, or may be learned from the practice of the invention.

DESCRIPTION OF THE INVENTION

Soluble, intact recombinant CRM₁₉₇ was first produced in protease-deficient E. coli (Bishai et al., 1987). However, the amount of protein production was very low. Subsequently, CRM₁₉₇ was produced in E. coli cells as inclusion bodies (Stefan A, et al. J Biotechnol. December 20; 156(4):245-52, 2010; International Application Publication No. WO 2011/126811, Chinese Patent Application No. 200610042194) or as soluble protein directed to the periplasm by a signal peptide (International Application Publication No. WO 2011/042516). The periplasm of E. coli is an oxidizing environment that allows the formation of disulfide bonds. CRM₁₉₇ has two disulfide bonds that are probably important for correct folding and function, and for protein solubility.

Uncleaved chains of soluble recombinant CRM protein can be rapidly produced intracellularly and in commercial quantities from microorganisms and thereafter isolated and/or purified in large quantities and remain soluble. CRM is soluble in phosphate buffered saline (PBS, pH 7.5) and other similar buffers, and without harsh treatments. However, these chains contain a formyl-methionine (f-met) at their N-terminus.

It has been surprisingly discovered that strains of microorganisms can be engineered that fulfill the goals of: (i) expression of soluble recombinant proteins containing disulfide bonds in the cytoplasm; and (ii) effectively removing formyl-methionine from the recombinant protein sequence. Expression strains have been developed that produce a large quantity of properly folded protein without a formyl-methionine at the protein's N-terminus.

As a first step, the shift from a reducing to an oxidative cytoplasmic environment was achieved by deletion of a single gene, gor. This single mutation allows disulfide bonds to be formed in the cytoplasm and does not affect cell vitality, resulting in high cell density growth. In addition, the BL21 gor− strain has an additional copy of an indigenous methionine aminopeptidase (MAP) gene. Peptidases that remove an N-terminal methionine include MAP, METAP1 (Methionine aminopeptidase 1), and METAP2 (Methionine aminopeptidase 2), all of which can be utilized. The coding sequence of E. coli MAP, optionally under control of an inducible promoter, was inserted in the place of the deleted Gor locus of BL21 gor− to prevent further disruption of the genome. Having an inducible promoter allows expression of the additional MAP gene to be initiated at a selected time and only when more MAP is needed to effectively remove formyl-methionine from overexpressed recombinant protein (which may be integrated or expressed from an expression vector). The tac-promoter was utilized as an inducible promoter for MAP. Thus, expression of the second MAP will start at the same time as expression of the recombinant protein, for example, when using IPTG to induce expression. Other inducible promoters for MAP expression can be used. The E. coli (BL21 Gor/met) strain is capable of producing soluble, disulfide bond-containing proteins intracellularly in gram quantities per liter of cell culture and of sufficiently removing the N-terminal methionine, resulting in the expression of native recombinant protein sequence.

Insertion-deletion was used to replace Gor with a chloramphenicol resistance gene flanked by two flippase recognition sites. Once bacteria carrying the insertion-deletion were identified by their chloramphenicol resistance, a plasmid carrying the flippase gene was introduced to remove the chloramphenicol gene. The final bacterial strain retains one flippase recognition site and the Gor gene deletion.
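The two-step marker replacement and excision described above can be pictured with a toy Python model that treats a chromosomal region as an ordered list of features. This is only a bookkeeping sketch: the neighbouring gene names, the `cat` marker label, and both function names are hypothetical, and no real sequence or recombination mechanics are modelled.

```python
# Toy model: a locus is an ordered list of feature names.
FRT = "FRT"  # flippase recognition site


def replace_with_cassette(locus, target, marker):
    """Replace `target` with an FRT-marker-FRT cassette (insertion-deletion)."""
    i = locus.index(target)
    return locus[:i] + [FRT, marker, FRT] + locus[i + 1:]


def flp_excise(locus):
    """Remove everything between the first two FRT sites, leaving one FRT."""
    first = locus.index(FRT)
    second = locus.index(FRT, first + 1)
    return locus[:first + 1] + locus[second + 1:]


if __name__ == "__main__":
    # Hypothetical neighbours "geneA" and "geneB" flank gor.
    wild_type = ["geneA", "gor", "geneB"]
    marked = replace_with_cassette(wild_type, "gor", "cat")
    final = flp_excise(marked)
    print(marked)  # ['geneA', 'FRT', 'cat', 'FRT', 'geneB']
    print(final)   # ['geneA', 'FRT', 'geneB']
```

The final list mirrors the stated outcome: the resistance marker is gone, gor is absent, and a single recognition site remains at the locus.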

Preferred CRM proteins produced are full length or partial regions such as, for example, peptides, single or multiple domains or epitopes, and any specific region expressed from native CRM coding sequences including CRM sequences that have been modified with one or more deletions, substitutions and/or additions (e.g. conservative or non-conservative), and CRM sequences that have been modified with additional sequences (e.g., one or more promoters, start codons, and translation factor, ribosome or polymerase binding sites) that promote expression in a host organism. A preferred CRM protein is CRM₁₉₇. Preferred is expression of CRM protein that is soluble and not otherwise bound as insoluble inclusion bodies of the cell. Preferred expression systems for the expression and production of CRM proteins include microorganisms with an intracellular oxidative state. Preferred expression systems may be recombinant or native eukaryotic or prokaryotic cells wherein recombinant cells include cells that contain a non-native CRM coding sequence. Preferred prokaryotic cells are strains of E. coli or another bacterial strain that contains one or more genetic alterations (e.g., one or more deletions or mutations). Preferably the one or more genetic alterations shift the redox state of the cytoplasm of the cell to a more oxidative state, as compared to wild-type, for example as disclosed in U.S. Pat. No. 7,410,788 (which is incorporated by reference). Alterations preferably reduce the activity of one or more disulfide reductase genes and/or other genes that reduce the oxidative state of the cytoplasm. Preferably, reduced activity is due to non-expression or reduced expression of one, two or multiple disulfide reductase or other genes, or one or more mutations that reduce activity of one or more expressed disulfide reductase proteins or other proteins. Preferred strains of microbial cells (e.g., recombinant, engineered or native eukaryotic or prokaryotic cells) have increased abilities to produce natively folded proteins containing disulfide bonds yet remain as functional proteins. The method of the invention produces quantities of CRM proteins containing full, truncated or modified CRM amino acid sequences (which may be integrated or expressed from an expression vector). Quantities of CRM protein produced according to the invention are surprising such as, for example, 600 mg or more of CRM protein per liter of bacterial cell culture.

One embodiment of the invention is directed to methods for the production of large quantities of protein, and preferably CRM₁₉₇, from E. coli host cells wherein an f-met that is present at the N-terminus of the protein is enzymatically removed. Production quantities are typically quantified as mg/L of bacterial cell culture. Protein production, according to the methods of the invention, is 200 mg/L or more, 300 mg/L or more, 400 mg/L or more, 500 mg/L or more, 600 mg/L or more, 700 mg/L or more, 800 mg/L or more, 900 mg/L or more, 1,000 mg/L or more, 1,500 mg/L or more, or 2,000 mg/L or more. Proteins expressed according to the invention include full length and truncated proteins, as well as modified amino acid sequences of the protein. Modifications include one or more of conservative amino acid deletions, substitutions and/or additions. A conservative modification is one that maintains the functional activity and/or immunogenicity of the molecule, although the activity and/or immunogenicity may be increased or decreased. Examples of conservative modifications include, but are not limited to, amino acid modifications (e.g., single, double and otherwise short amino acid additions, deletions and/or substitutions), modifications outside of the active or functional sequence, residues that are accessible for conjugation in forming a vaccine, modifications due to serotype variations, modifications that increase immunogenicity or increase conjugation efficiency, modifications that do not substantially alter binding to heparin, modifications that maintain proper folding or three dimensional structure, and/or modifications that do not significantly alter immunogenicity of the protein or the portions of the protein that provide protective immunity.

Recombinant cells that are used in the method of the invention are preferably E. coli bacteria and, preferably, E. coli that are genetically engineered to shift the redox state of the cytoplasm to a more oxidative state such as, for example, by mutation of one or more disulfide reductase genes such as, for example, an oxidoreductase, a dihydrofolate reductase, a thioredoxin reductase, a glutamate cysteine lyase, a disulfide reductase, a protein reductase, and/or a glutathione reductase. Preferably one or more disulfide reductase genes are mutated and rendered non-functional or marginally functional such that the redox state of the cytoplasm of the cell is shifted to a more oxidative state as compared to wild type. Oxidative protein folding involves the formation and isomerization of disulfide bridges and plays a key role in the stability and solubility of many proteins including CRM₁₉₇. Formation and breakage of disulfide bridges is generally catalyzed by thiol-disulfide oxidoreductases. These enzymes are characterized by one or more Trx folds that consist of a four-stranded β-sheet surrounded by three α-helices, with a CXXC redox active-site motif. The assembly of various Trx modules has been used to build the different thiol oxidoreductases found in prokaryotic and in eukaryotic organisms. In the bacterial periplasm, the proteins are kept in the appropriate oxidation state by a combined action of the couples DsbB-DsbA and DsbD-DsbC/DsbE/DsbG (Inaba 2009, Gruber et al, 2006). Many protein expression systems are well known in the art and commercially available.

Especially preferred microbes include E. coli expression strains, for example, chemically competent E. coli K12 cells engineered to form disulfide bonded proteins in the cytoplasm (e.g., ORIGAMI™ (EMD Millipore) and SHUFFLE™ (New England Biolabs)). Other strains and types of cells and other E. coli strains with enhanced oxidative redox state also may be used. For example, ORIGAMI™ 2 host strains are K-12 derivatives that have mutations in both the thioredoxin reductase (trxB) and glutathione reductase (gor) genes, which greatly enhance disulfide bond formation in the E. coli cytoplasm. These strains are kanamycin sensitive; like the original Origami strains, the gor mutation is still selected for by tetracycline. To reduce the possibility of disulfide bond formation between molecules, strains containing mutations in trxB and gor are recommended only for the expression of proteins that require disulfide bond formation for proper folding. SHUFFLE™ cells are chemically competent E. coli K12 cells engineered to form proteins containing disulfide bonds in the cytoplasm. Preferably these cells contain mutations in trxB and gor and the cytoplasmic chaperone disulfide bond isomerase DsbC (fhuA2 [lon] ompT ahpC gal λatt::pNEB3-r1-cDsbC (Spec^(R), lacI^(q)) ΔtrxB sulA11 R(mcr-73::miniTn10-Tet^(S))2 [dcm] R(zgb-210::Tn10-Tet^(S)) endA1 Δgor Δ(mcrC-mrr)114::IS10). Also preferably, cells are suitable for T7 promoter driven protein expression and of the genotype F′ lac, pro, lacIQ/Δ(ara-leu)7697 araD139 fhuA2 lacZ::T7 gene1 Δ(phoA)PvuII phoR ahpC* galE (or U) galK λatt::pNEB3-r1-cDsbC (Spec^(R), lacI^(q)) ΔtrxB rpsL150(Str^(R)) Δgor Δ(malF)3. SHUFFLE™ strains constitutively express a chromosomal copy of the disulfide bond isomerase DsbC. DsbC promotes the correction of mis-oxidized proteins into their correct form. Cytoplasmic DsbC is also a chaperone that can assist in the folding of proteins that do not require disulfide bonds.

Another embodiment of the invention comprises recombinant cells such as, for example, bacterial, mammalian or insect cells containing expressible protein sequences, wherein an f-met that is present at the N-terminus of the expressed protein is enzymatically removed. Preferred host cells include, but are not limited to, cells genetically engineered to shift the redox state of the cytoplasm to a more oxidative state, that contain and express an inducible MAP gene. Preferred cells include prokaryotic or eukaryotic cells such as, for example, E. coli cell expression systems, Baculovirus Expression System and other bacterial and/or eukaryotic cellular expression systems. Preferably the cells contain a protein expression system for expressing foreign or non-native sequences. Also preferably, the sequences to be expressed are comprised of an expression vector which contains one or more of an inducible promoter (e.g., auto-inducible preferably with specific media), a start codon (e.g., ATG), a ribosome binding site, and/or a modified sequence between ribosome binding site and ATG starting codon, or between start codon and the sequence to be expressed. Preferred modified sequences or spacer sequences include, for example, a number of nucleotides more or less than 9 (e.g., between 7 and 12 nucleotides), and preferably not 9 nucleotides.
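The spacer preference stated above (7 to 12 nucleotides between the ribosome binding site and the ATG start codon, and preferably not 9) can be checked with a short Python sketch. The consensus motif `AGGAGG` used as the default ribosome binding site, the function names, and the example sequences are illustrative assumptions and are not taken from this document.

```python
from typing import Optional


def spacer_length(seq: str, rbs: str = "AGGAGG", start: str = "ATG") -> Optional[int]:
    """Count nucleotides between the end of the RBS and the next start codon.

    ASSUMPTION: the default RBS motif is a generic consensus used only
    for illustration. Returns None if either element is not found.
    """
    s = seq.upper()
    rbs_pos = s.find(rbs)
    if rbs_pos == -1:
        return None
    rbs_end = rbs_pos + len(rbs)
    start_pos = s.find(start, rbs_end)
    if start_pos == -1:
        return None
    return start_pos - rbs_end


def spacer_is_preferred(n: Optional[int]) -> bool:
    """Apply the stated preference: 7 to 12 nucleotides, but not 9."""
    return n is not None and 7 <= n <= 12 and n != 9


if __name__ == "__main__":
    # Hypothetical 5' regions; only the spacer length matters here.
    for utr in ("TTAGGAGGAACAGACCATGGGC", "TTAGGAGGAACAGACCTATGGGC"):
        n = spacer_length(utr)
        print(utr, n, spacer_is_preferred(n))
```

The first example has an 8-nucleotide spacer and passes the check; the second has exactly 9 nucleotides and does not.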

Another embodiment of the invention is directed to recombinant protein and the expression of recombinant protein in E. coli or another host cell using an expression vector with an inducible promoter and/or a modified sequence between the ribosome binding site and the ATG starting codon, wherein an f-met that is present at the N-terminus of the recombinant protein is enzymatically removed. Preferably, the expression vector includes the lactose/IPTG inducible promoter, preferably a tac promoter, and the sequence between the ribosome binding site and the ATG starting codon.

Another embodiment of the invention comprises an expression construct of nucleotide or amino acid sequences, with or without an enhancer region. Enhancer regions promote protein expression by adding one or more sequences that promote nucleic acid recognition for increased expression (e.g., start codon, enzyme binding site, translation or transcription factor binding site). Preferably, an enhancer of the invention contains a ribosome binding site with a start codon upstream of and with a coding sequence that differs from the coding sequence of the CRM protein.

Another embodiment of the invention is directed to recombinant protein purified according to the methods of the invention. Purification preferably comprises heparin or heparin-like affinity chromatography. The presence of a heparin binding site allows the use of heparin or heparin-like resins in the purification. Heparin-like resins include resins containing functional sulfate groups, such as dextran sulfate, e.g. Dextran sulfate (Sterogene), Capto DeVirS (GE) or sulfate esters, e.g. Cellufine Sulfate (Asahi Kasei Bioprocess).

In a first step, crude E. coli extract may be clarified, for example, preferably by centrifugation or depth filtration. Optionally, cleared lysate may be fractionated further, preferably by adding salts that affect protein solubility and salt out the expressed protein. In the second step, clarified lysate or the re-solubilized salted-out fraction containing protein may be applied, for example, to an anion exchange resin under conditions in which the protein is in the flow through. In the third step, the flow through fraction containing protein may be applied to a column. Preferred column resins include, but are not limited to, dextran sulfate resins, CELLUFINE™ resins (Chisso Corporation; chromatography gel), active sulfated resins, phosphate resins, or heparin or heparin-like resins. Preferably protein is bound to the resin in a low salt buffer and eluted in a higher salt buffer, yielding highly purified protein. Preferred binding buffers contain, for example, one or more ionic reagents and/or reagents that increase conductivity, one or more chaotropic agents, NaCl, KCl, glycerol, isopropyl alcohol, ethanol, arginine, acetate, guanidine, urea, ATP, one or more mono-, di-, tri-, and/or poly-phosphates, sulfates or pyrophosphates, and combinations thereof. Preferred elution buffers contain, for example, a higher concentration of one or more components of the binding buffer.

Other preferred purification methods include any one or combination of anion exchange chromatography, hydrophobic interaction chromatography and/or Cibacron-Blue resin (CN 101265288A, U.S. Pat. No. 8,383,783). Purification methods of the invention produce recombinant CRM protein (e.g., CRM₁₉₇) at high yields, as discussed herein, and with a purity level of greater than 80%, preferably greater than 85%, preferably greater than 90%, preferably greater than 95%, preferably greater than 99%, and preferably with an even greater purity.

Another embodiment of the invention is directed to proteins and peptides as well as portions and domains thereof, that can be manufactured according to the method of the invention. Proteins and peptides comprise, but are not limited to, for example, those proteins and peptides that can be cytoplasmically expressed without leader or tag sequences and at commercially significant levels according to the methods disclosed and described herein. Preferably, these proteins and peptides show proper folding upon expression in recombinant cells of the invention. Recombinant cells of the invention preferably show reduced activity of one or more disulfide reductase enzymes, preferably reduced activity of fewer than five disulfide reductase enzymes, preferably reduced activity of fewer than four disulfide reductase enzymes, and also preferably reduced activity of fewer than three disulfide reductase enzymes. Preferably expression of the proteins and peptides is increased in recombinant cells of the invention, but may be not reduced or not significantly reduced compared with expression in a recombinant cell that does not have reduced activity of one or more disulfide reductase enzymes. Proteins and peptides that can be expressed in the methods disclosed herein include, but are not limited to, for example, tetanus toxin, tetanus toxin heavy chain proteins, diphtheria toxoid, CRM, tetanus toxoid, Pseudomonas exoprotein A, Pseudomonas aeruginosa toxoid, Bordetella pertussis toxoid, Clostridium perfringens toxoid, Escherichia coli heat-labile toxin B subunit, Neisseria meningitidis outer membrane complex, Haemophilus influenzae protein D, Flagellin Fli C, Horseshoe crab Haemocyanin, and fragments, derivatives, and modifications thereof.

Another embodiment of the invention is directed to portions and domains of the expressed proteins, fused genetically or by chemical modification or conjugation (e.g., carbodiimide, 1-cyanodimethylaminopyridinium tetrafluoroborate (CDAP)) with another molecule. Preferred other molecules are molecules such as, but not limited to, other proteins, peptides, lipids, fatty acids, saccharides and/or polysaccharides, including molecules that extend half-life (e.g., PEG, antibody fragments such as Fc fragments), stimulate and/or increase immunogenicity, or reduce or eliminate immunogenicity. Many proteins contain an N-terminal serine, which is useful for conjugation. Typical conjugation partner molecules include, but are not limited to, polymers such as, for example, bacterial polysaccharides, polysaccharides derived from yeast, parasites and/or other microorganisms, polyethylene glycol (PEG) and PEG derivatives and modifications, dextrans, and derivatives, modifications and fragments of dextrans. One example of a polymer conjugate is PEGASYS® (peginterferon alfa-2a). Other polymers, such as dextran, also increase the half-life of proteins and reduce immunogenicity of the conjugate partner. Polymers may be linked randomly or in a directed manner through site-specific conjugation such as, for example, by modification of N-terminal serines and/or threonines. Also, modifications may be used that selectively oxidize chemical groups for site-specific conjugation.

Another embodiment of the invention is directed to methods of producing a peptide containing a domain, fragment and/or portion comprising: expressing the peptide from a recombinant cell containing an expression vector that encodes the peptide, wherein the recombinant cell has a reduced activity of one or more disulfide reductase enzymes and the expression vector contains a promoter functionally linked to a coding region of the peptide, wherein the one or more disulfide reductase enzymes comprises one or more of an oxidoreductase, a dihydrofolate reductase, a thioredoxin reductase, or a glutathione reductase; and isolating the peptide expressed, wherein the peptide expressed is soluble and wherein the protein or peptide is expressed with an f-met at the 5′-terminus that is removed by a peptidase that is also expressed within the recombinant cell. Preferably the expression vector contains a ribosome binding site, an initiation codon, and, optionally, an expression enhancer region. Preferably the recombinant cell has a reduced activity of only one disulfide reductase enzyme, only two disulfide reductase enzymes, or two or more disulfide reductase enzymes. Preferably the reduced activity of the disulfide reductase enzymes results in a shift in the redox status of the cytoplasm to a more oxidative state as compared to a recombinant cell that does not have reduced activity of one or more disulfide reductase enzymes. Preferably the recombinant cell is an E. coli cell or a derivative or strain of E. coli. Preferably the soluble peptide expressed comprises a natively folded protein or domain of the protein. The promoter may be a constitutive or inducible promoter; with an inducible promoter, expression comprises inducing the promoter with an inducing agent. Preferred inducing agents include, for example, lactose (PLac), isopropyl β-D-1-thiogalactopyranoside (IPTG), substrates and derivatives of substrates.
In one preferred embodiment, the recombinant cell contains a second expression vector that preferably contains a coding region for a peptidase that preferably acts upon and selectively cleaves the peptide or protein expressed from the first expression vector. Preferably the second expression vector contains a second promoter functionally linked to the coding region and co-expressing comprises expressing the peptide and the peptidase. The two expression vectors may be induced together with the same inducing agent, or with different inducing agents, optionally at different times. Preferably the peptidase acts on and cleaves the peptide co-expressed with the peptidase. Preferably the peptide expressed is conjugated with a polymer such as, for example, dextran, a bacterial capsular polysaccharide, polyethylene glycol (PEG), or a fragment, derivative or modification thereof. Preferably the peptide expressed is coupled with a polymer which includes, for example, a polysaccharide, a peptide, an antibody or portion of an antibody, a lipid, a fatty acid, or a combination thereof.

Another embodiment of the invention comprises conjugates of proteins expressed and cleaved according to the disclosures herein, including fragments, domains, and portions thereof as disclosed and described herein.

Another embodiment of the invention comprises fusion molecules of proteins, including fragments, domains, and portions thereof, as disclosed and described herein.

Another embodiment of the invention comprises a vaccine of proteins, including fragments, domains, and portions thereof, as disclosed and described herein.

The following examples illustrate embodiments of the invention, but should not be viewed as limiting the scope of the invention.

EXAMPLES

Example 1 CRM197 Expressed in BL21

An E. coli strain was engineered to express soluble recombinant proteins containing disulfide bonds and to effectively remove formyl-methionine from the protein sequence. The strain was developed from the widely used BL21 expression strain and deposited with the American Type Culture Collection. The shift from a reducing to an oxidative cytoplasmic environment was achieved by deletion of the glutathione reductase gene. This single mutation allows disulfide bonds to be formed in the cytoplasm and does not affect cell vitality, resulting in high cell density growth. In addition, the BL21 gor− strain has an additional copy of indigenous methionine aminopeptidase (MAP). The coding sequence of E. coli MAP under control of an inducible promoter was inserted in place of the deleted Gor locus of BL21 gor− to prevent further disruption of the genome. An inducible promoter allows expression of the additional MAP gene to be initiated at the desired time and only when more MAP is needed to effectively remove formyl-methionine from overexpressed recombinant protein. The tac promoter was used as the inducible promoter for MAP. Thus, expression of the second MAP starts at the same time as expression of a recombinant protein when IPTG is used to induce its expression. Any other inducible promoter can be used for MAP expression.

The E. coli strain created is capable of producing soluble, disulfide bond-containing proteins intracellularly in gram quantities per liter of cell culture and of sufficiently removing the N-terminal methionine, resulting in expression of the native recombinant protein sequence.

Deletion of glutathione oxidoreductase used phage lambda Red recombination (Datsenko K. A., Wanner B. L., 2000). The parental strain used was E. coli BL21 (NEB, C2530H, lot #0191508). Plasmids used included CGSC, #140584, pKD46/MG1655 (7669), pKD3/BW25141/pKD3 (7631), pCP20/BT340 (7629).

Insertion deletion was used to replace Gor with the chloramphenicol resistance gene sequence positioned between two flippase recognition sites. Once bacteria carrying the insertion deletion were identified by chloramphenicol resistance, the flippase gene was introduced on a plasmid to remove the chloramphenicol gene. The final bacterial strain retains one flippase recognition site and carries the Gor gene deletion.

Cloning of MAP gene in the place of gor− locus.

PCR primers were used to amplify the MAP coding sequence from BL21 gor− genomic DNA. The amplified MAP sequence was cloned into an expression vector under control of the tac promoter and T7 terminator. PCR primers were used to amplify a fragment containing MAP with promoter and terminator. PCR primers were used to obtain the linearized pKD3 sequence.

These two fragments were assembled to produce the ptac-mAP-pKD3 plasmid. The MAP sequence was cloned upstream of the marker gene chloramphenicol acetyl transferase (CAT), which is flanked by two flippase recognition target sequences. PCR primers were used to amplify a DNA fragment containing the tac promoter, MAP and CAT.

This fragment was transformed into competent cells containing pKD46, plated on chloramphenicol-containing LB plates, and incubated at 37° C. overnight. Resulting colonies that tested negative on ampicillin plates were picked for PCR with primers mAP insertion F and C1 mj (5′-gcccccgttttcaccatggg-3′) to check for positive clones. This cell line was designated BL21 ΔGor::(CAT, mAP). To remove the CAT marker gene, the pCP20 plasmid, which contains a flippase gene, was transformed into BL21 ΔGor::(CAT, mAP) and ampicillin-resistant clones were selected at 30° C. Positive clones were streaked and grown at 42° C. overnight. Colonies that did not grow on ampicillin plates were selected and subjected to sequencing confirmation. Primers Down Born and up Gor were used for PCR on the genomic DNA of these colonies to obtain the inserted sequence. These PCR products were then sequenced using the AP insert F and AP insert R primers to confirm insertion of the additional mAP at the Gor locus.

CRM197 was produced intracellularly in BL21 GorMet and subjected to intact mass analysis. F-methionine was cleaved completely, and 100% of the purified protein had the mature amino acid sequence lacking f-Met.

Two main species were observed in the NO21p114 CRM sample, consistent with a CRM monomer with an N-terminal Met (58,542 Da) and a CRM monomer without an N-terminal Met (reduced by 131 Da). One main species was observed in the NO21p221 CRM sample, consistent with a CRM monomer without an N-terminal Met (58,411 Da) (see Table I).

TABLE I
Non-Reduced CRM

Observed Species    Theoretical    NO21p114       NO21p221
CRM w/o Met         58,409 Da      58,411.6 Da    58,411.4 Da
CRM w/Met           58,540 Da      58,542.2 Da    not observed
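The 131 Da difference between the two species in Table I can be checked against the mass of a methionine residue. The following is a minimal sketch only: the atomic masses are standard average values that do not come from this specification, the residue composition C5H9NOS (free methionine less one water) is the usual in-chain composition, and all variable names are illustrative.

```python
# Average atomic masses (g/mol); standard values, not taken from the specification.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# A methionine residue within a peptide chain: C5H9NOS (free Met minus water).
MET_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1}


def residue_mass(composition):
    """Sum average atomic masses over an elemental composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# Values copied from Table I (Da).
theoretical = {"with_met": 58540.0, "without_met": 58409.0}
observed_no21p114 = {"with_met": 58542.2, "without_met": 58411.6}

met_shift = residue_mass(MET_RESIDUE)
theoretical_shift = theoretical["with_met"] - theoretical["without_met"]
observed_shift = observed_no21p114["with_met"] - observed_no21p114["without_met"]

print(f"Met residue mass:  {met_shift:.2f} Da")
print(f"Theoretical shift: {theoretical_shift:.1f} Da")
print(f"Observed shift:    {observed_shift:.1f} Da")
```

Under these assumptions, the theoretical and observed shifts both agree with the loss of a single methionine residue to within about 1 Da.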

Example 2 IL10 Expression in E. coli

IL10 was expressed in the E. coli strain. Peptide fragmentation by trypsin and LC-MS/MS was used to show that the protein had a molecular weight consistent with the f-Met being cleaved off the N-terminus and that the N-terminal peptide had the expected sequence.
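The logic of the tryptic analysis can be illustrated with an in silico digest. This is a sketch under stated assumptions: it applies the commonly used trypsin rule (cleavage after K or R, except before P), and the sequence shown is hypothetical and is not the IL10 sequence. The function name is likewise illustrative.

```python
def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (common trypsin rule)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


# HYPOTHETICAL sequence used only for illustration.
unprocessed = "MSAGTWKLPDERPVNK"   # initiator Met retained
processed = unprocessed[1:]        # initiator Met removed

n_term_unprocessed = tryptic_peptides(unprocessed)[0]
n_term_processed = tryptic_peptides(processed)[0]

print(n_term_unprocessed)
print(n_term_processed)
```

Only the N-terminal peptide differs between the two forms, which is why identifying that peptide by LC-MS/MS indicates whether the methionine was removed.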

Example 3 Site-Specific Modification of a Protein

1-amino, 2-alcohol groups, such as those of an N-terminal serine or threonine, are selectively oxidized to an aldehyde by hypervalent iodine compounds (e.g., metaperiodate) (J. Pept. Sci. 2016; 22: 271-279) to create a protein with a unique N-terminal aldehyde. As there are few to no aldehydes naturally in proteins, this N-terminal aldehyde is selectively modified or functionalized. Aminooxy, hydrazide and hydrazine reagents are particularly useful for functionalizing aldehydes. Some examples of these reagents for modifying aldehydes are described in Bioconjugate Chem. 1990, 7, 96 and Vaccine, 2006, 24:716. Using standard genetic engineering methods, sequences for proteins containing N-terminal serine or threonine are introduced into E. coli bacteria for expression.

Proteins are synthesized in E. coli with f-Met as the starting amino acid, which is normally cleaved off by peptidases. However, high expression levels of heterologous proteins in E. coli can leave the f-Met untrimmed on a significant fraction of the protein, thus blocking the expected N-terminal serine or threonine. In the E. coli strain of Example 1, the N-terminal f-Met is efficiently cleaved off, which allows proteins to be expressed so that the N-terminal serine or threonine is available for facile oxidation.

Example 4 Terminal Modification of a Protein

CRM₁₉₇ is expressed as described in Example 1, but with an N-terminal serine or threonine. Oxidation of the N-terminal serine or threonine is carried out as generally described in Bioconjugate Chem. 1994, 5, 636-646. The CRM₁₉₇ is purified and prepared at 5 mg/ml in sodium phosphate buffer, pH 6.8, treated with a 5× molar excess of sodium metaperiodate for 8 minutes and then quenched with a 10× molar excess of glycerol. The N-terminal oxidized protein is desalted and concentrated to 5 mg/ml in HEPES, pH 8. The protein is then reacted with an aminooxy reagent such as 1-(Aminooxy)-4-[(3-nitro-2-pyridyl)dithio]butane (Bioconjugate Chem. 1990, 7, 96-99), optionally in the presence of a mild reducing agent such as sodium cyanoborohydride. Following purification, the product is a protein with the N-terminus modified with a thiol-pyridyl disulfide. The disulfide is reduced with a low concentration of a reducing agent (e.g., DTT) to prepare CRM₁₉₇ with a single terminal thiol. The monothiol protein is then reacted with an antigen or other molecule containing a suitable electrophilic group, for example, a polysaccharide functionalized with maleimide groups.
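The molar excesses stated above can be converted to approximate working concentrations. The sketch below makes several assumptions: the molecular weight is the theoretical value for CRM without Met from Table I (the added N-terminal serine or threonine would change it slightly), and the 10× glycerol excess is taken to be relative to periodate, which the text does not specify. Constant and function names are illustrative.

```python
PROTEIN_MW_DA = 58409.0      # theoretical CRM w/o Met from Table I; approximate here
PROTEIN_MG_PER_ML = 5.0      # protein concentration stated in this example
PERIODATE_EXCESS = 5.0       # 5x molar excess over protein
GLYCEROL_EXCESS = 10.0       # ASSUMPTION: 10x relative to periodate, not protein


def molar_conc_um(mg_per_ml, mw_da):
    """Convert mg/mL to micromolar: (g/L) / (g/mol) = mol/L, then x 1e6."""
    return mg_per_ml / mw_da * 1e6


protein_um = molar_conc_um(PROTEIN_MG_PER_ML, PROTEIN_MW_DA)
periodate_um = PERIODATE_EXCESS * protein_um
glycerol_um = GLYCEROL_EXCESS * periodate_um

print(f"Protein:   {protein_um:.1f} uM")
print(f"Periodate: {periodate_um:.1f} uM")
print(f"Glycerol:  {glycerol_um:.1f} uM")
```

If the glycerol excess is instead meant relative to protein, `glycerol_um` would be computed from `protein_um`; the sketch makes that choice explicit so it can be changed.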

The monothiol is further modified by reaction with thiol-reactive reagents. The thiol is converted to a maleimide group by the addition of an excess of a di-maleimide reagent. The N-terminal aldehyde is labeled with useful groups, including click reagents, biotin, fluorescent compounds, maleimides, bromoacetyl groups and more. A catalyst, such as an aniline or 4-aminophenylalanine, is used to promote oxime formation (J. Pept. Sci. 2016; 22: 271-279). The oxime can optionally be reduced using a reagent such as sodium cyanoborohydride. The reducing reagent may be present during the oximation reaction if it does not affect the aldehyde or may be added at the conclusion of the reaction. Hydrazide and hydrazine reagents may also be reacted with the oxidized N-terminal amino acid instead of aminooxy reagents.

Example 5 Bacterial Strain with Elevated Activity of Methionine Aminopeptidase (MAP) that Forms Disulfide Bonds in the Cytoplasm

E. coli strains used to express recombinant proteins can have a reducing intracellular environment that prevents formation of disulfide bonds. However, many proteins require disulfide bonds for proper folding, solubility, and function. Expression in the reducing cytoplasm of E. coli leads to insoluble expression of recombinant proteins (e.g., inclusion bodies). These proteins must then be refolded, a frequently challenging and largely unsuccessful process. To achieve soluble production of such proteins in E. coli, expression can be directed to the periplasm.

However, periplasmic expression results in low expression yield due to the limited capacity of E. coli secretion pathways. Strains of E. coli are commercially available that contain multiple genes knocked out to create an oxidizing intracellular environment. Such strains include SHuffle® (New England Biolabs) and Origami™ (Millipore). Neither of these strains is well-suited for recombinant protein manufacturing due to impaired growth and low cell mass accumulation.

An E. coli strain (GorMet) was created that contains an oxidizing intracellular environment and is capable of producing soluble cytoplasmic proteins, including ones containing disulfide bonds. A property of the strain is that it is not crippled and can generate large amounts of cell mass, and thus is highly suitable for recombinant protein manufacturing.

In contrast to previous E. coli strains with an oxidative intracellular environment, the GorMet strain has only a single gene deletion, that of glutathione reductase, which shifts the intracellular environment of E. coli cells to an oxidative state to allow disulfide formation. Surprisingly, this single gene deletion creates a suitable oxidative intracellular environment which does not affect cell growth, allowing the expression of grams of recombinant proteins per liter of cell culture.

Deletion of glutathione reductase from BL21 genome:

    1. PCR was used to introduce upstream and downstream sequences of the BL21 Gor gene to 3′ and 5′ of a chloramphenicol acetyl transferase gene framed with flippase recognition sites, respectively.
    2. BL21 cells were transformed with a temperature-sensitive plasmid expressing Red recombinase, and recombinase activity was induced.
    3. The cells resulting from step 2 were transformed with the step 1 PCR product to replace the Gor gene with the chloramphenicol acetyl transferase gene.
    4. Positive clones were selected by chloramphenicol resistance.
    5. The plasmid containing Red recombinase was removed by growing the transformed BL21 cells at 42° C.
    6. A temperature-sensitive plasmid containing flippase was transformed into the cells and used to remove the chloramphenicol acetyl transferase gene from the genome.
    7. The cells of step 6 were grown at 42° C. to remove the plasmid containing flippase.
    8. Sequencing was performed to confirm the gene insertion at the correct locus.

This created a BL21 E. coli strain with an oxidative intracellular environment.
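The net effect of the gene replacement and marker removal steps on the locus can be pictured with a toy model. This is only an illustrative sketch: the genome is represented as a plain string, the bracketed tokens are hypothetical placeholders rather than real sequences, and the two functions merely stand in for Red recombination and flippase excision without modeling any biology.

```python
FRT = "[FRT]"   # placeholder for a flippase recognition site
CAT = "<cat>"   # placeholder for the chloramphenicol acetyl transferase gene


def red_replace(genome, target, cassette):
    """Stand-in for Red recombination: swap the target gene for the cassette."""
    if genome.count(target) != 1:
        raise ValueError("target must occur exactly once")
    return genome.replace(target, cassette)


def flp_excise(genome):
    """Stand-in for flippase: drop everything between two FRT sites, keep one scar."""
    first = genome.find(FRT)
    last = genome.rfind(FRT)
    if first == -1 or first == last:
        raise ValueError("two FRT sites are required")
    return genome[:first] + FRT + genome[last + len(FRT):]


wild_type = "upstream-<gor>-downstream"
after_replacement = red_replace(wild_type, "<gor>", FRT + CAT + FRT)
after_excision = flp_excise(after_replacement)

print(after_replacement)
print(after_excision)
```

In this model the final locus keeps a single recognition site and lacks both the Gor gene and the resistance marker, matching the outcome described for the strain.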

A substantial amount of the recombinant protein produced in a microorganism contains methionine as the first amino acid. As Met is not a part of the mature protein sequence, this is not desirable for many proteins. Expression of recombinant (heterologous) proteins in E. coli initiates at the start of the coding sequence of the protein, the ATG codon. This is translated to formyl-methionine (N-terminal Met). During the intracellular expression of proteins, the N-terminal Met is excised by endogenous E. coli methionine aminopeptidase (or peptidases). Because of overexpression of the protein and the limited amount of MAP, this process is not efficient even if the residue adjacent to the N-terminal methionine is optimal for cleavage.
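The dependence of cleavage on the residue adjacent to the N-terminal methionine can be expressed as a simple check. The rule used below is a rule of thumb from the general literature on methionine aminopeptidase (removal is favored when the second residue has a small side chain) and is an assumption here, not a statement from this specification. The residue set, function name, and example sequences are illustrative.

```python
# ASSUMPTION: literature rule of thumb, not taken from this specification.
# Second residues with small side chains generally favor Met removal.
SMALL_PENULTIMATE = set("GASTCPV")


def met_removal_favored(sequence):
    """Return True if the sequence starts with M followed by a small residue."""
    if len(sequence) < 2 or sequence[0] != "M":
        return False
    return sequence[1] in SMALL_PENULTIMATE


# HYPOTHETICAL N-terminal sequences for illustration.
for n_terminus in ("MSAGT", "MTLPD", "MKLVE"):
    print(n_terminus, met_removal_favored(n_terminus))
```

A favorable second residue is necessary but, as noted above, not sufficient when the protein is overexpressed and the available peptidase is limiting.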

Present methods to cleave formyl-methionine include in vitro treatment with recombinant MAP or adding the MAP coding sequence to the expression vector in order to co-express it with the recombinant protein. These methods are time consuming and costly, as a new expression vector must be created and optimized for each new recombinant protein.

In this example, the MAP gene was integrated into the genome of this E. coli strain, with MAP expression controlled by a promoter. Recombinant proteins expressed had the formyl-methionine removed from their N-terminus such that no or very low levels of N-terminal methionine were present on expressed proteins, and removal did not affect the disulfide bond formation needed for solubility and function in the cytoplasm of E. coli. Recombinant proteins expressed in this strain were soluble, had their disulfide bonds correctly formed and possessed the native sequence without an N-terminal methionine.

The strain BL21 E. coli gor⁻ was engineered further by inserting an extra copy of bacterial methionine aminopeptidase (MAP) under the control of an inducible promoter into the gor⁻ locus.

Method for MAP gene insertion:

    1. PCR was used to obtain a DNA fragment containing the coding sequence of methionine aminopeptidase (MAP) from the E. coli BL21 genome.
    2. The MAP coding sequence was cloned into a bacterial expression vector under the control of the tac promoter.
    3. PCR was used to amplify the MAP sequence along with the promoter and terminator using the plasmid from step 2 as a template.
    4. PCR was used to add the methionine aminopeptidase gene with promoter and terminator from step 3 upstream of the chloramphenicol acetyl transferase promoter on the pKD3 vector.
    5. PCR was used to obtain a gene cassette that contained the methionine aminopeptidase and chloramphenicol acetyl transferase genes with their respective promoters and terminators, together with additional sequences from upstream and downstream of the glutathione reductase gene deletion locus (gor−).
    6. Red recombineering was performed to insert the gene cassette made in step 5 into the Gor locus in BL21 Gor−.
    7. Flippase was used to remove the chloramphenicol acetyl transferase gene from the genome.
    8. Sequencing was performed to confirm that the gene insertion occurred at the correct locus.

With an inducible promoter (Tac), expression of the additional MAP gene can be initiated at the desired time and only when more MAP is needed to effectively remove the formyl-methionine from the N-terminus of the overexpressed recombinant protein. The expression of the second MAP starts at the same time as the expression of a recombinant protein when IPTG is the inducer. Any other inducible promoter can be used for MAP expression.

Example 6. Expression of Genetically Detoxified Diphtheria Mutant CRM₁₉₇ in the GorMet Strain and in the Gor− E. coli Strain

The expression level was greater than about 2 g per liter of fermenter cell culture for both strains, indicating that co-expression of the MAP did not significantly affect the expression of CRM197. Purified CRM197 from each strain was analyzed by mass spectrometry (see Table II).

TABLE II
Major species

Expression system    With N-met    Without N-met
                     58,540        58,409
Gor                  58,541
GorMet                             58,411

It is seen that the major species for CRM197 expressed in the GorMet strain had an appropriate molecular weight for the N-terminal Met being cleaved, whereas expression in the Gor strain resulted in CRM197 with an N-terminal methionine.
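The assignment of each major species to a form of CRM197 can be written as a nearest-mass comparison. This sketch uses the masses from Table II; the 5 Da tolerance is an arbitrary assumption chosen for illustration and does not reflect a stated instrument accuracy, and the function and variable names are illustrative.

```python
# Reference masses (Da) from the header of Table II.
REFERENCE_DA = {"with N-Met": 58540.0, "without N-Met": 58409.0}


def assign_species(observed_da, tolerance_da=5.0):
    """Pick the closest reference mass; reject if outside the tolerance.

    ASSUMPTION: the 5 Da default tolerance is illustrative only.
    """
    label, mass = min(REFERENCE_DA.items(), key=lambda item: abs(item[1] - observed_da))
    if abs(mass - observed_da) > tolerance_da:
        return "unassigned"
    return label


# Major species observed for each expression system (Table II).
observed = {"Gor": 58541.0, "GorMet": 58411.0}
assignments = {strain: assign_species(mass) for strain, mass in observed.items()}

for strain, label in assignments.items():
    print(f"{strain}: {label}")
```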

Example 7. Expression of Cytokine IL10 from Epstein-Barr Virus in the GorMet Strain

IL10 was expressed intracellularly in soluble form. Purified IL10 was subjected to mass spectrometry analysis. Following enzymatic digestion with trypsin, the sample was analyzed by LC-MS/MS. The IL10 amino acid sequence did not contain an N-terminal methionine.

Example 8. Use of GorMet Strain to Produce Proteins that can be Site-Specifically Modified at the N-Terminus

It can be desirable to site-specifically modify a protein. 1-amino, 2-alcohol groups, such as those of an N-terminal serine or threonine, can be selectively oxidized to an aldehyde (J. Pept. Sci. 2016; 22: 271-279) to create a protein with a unique N-terminal aldehyde. As there are few to no aldehydes naturally in proteins, this N-terminal aldehyde can then be selectively modified or functionalized. Aminooxy, hydrazide and hydrazine reagents are particularly useful for functionalizing aldehydes. Some examples of these reagents for modifying aldehydes are described in Bioconjugate Chem. 1990, 7, 96 and Vaccine, 2006, 24:716. Using standard genetic engineering methods, sequences for proteins containing 1-amino, 2-alcohol amino acids can be introduced into genes being expressed in E. coli bacteria. These N-terminal amino acids can then be oxidized to create an aldehyde for site-specific modification. However, if the N-terminal Met is not efficiently removed, the 1-amino, 2-alcohol group cannot be easily selectively oxidized.

The E. coli strains created provide a method for efficiently cleaving off the N-terminal Met and thus allow proteins to be expressed so that the N-terminal serine or threonine is available for facile oxidation. A gene for expressing a protein with an N-terminal 1-amino, 2-alcohol amino acid was introduced into the GorMet strain, expressed and purified. The purified protein had the N-terminal Met cleaved, leaving the 1-amino, 2-alcohol amino acid at the N-terminus. Oxidation of this amino acid was carried out as generally described in Bioconjugate Chem. 1994, 5, 636-646. The protein, containing an N-terminal aldehyde, was further reacted.

A gene containing the CRM197 sequence plus an N-terminal serine or threonine was introduced into the GorMet E. coli as described. The expressed CRM197 has essentially all of the Met cleaved by the MAP so that the protein has an N-terminal serine or threonine. The CRM₁₉₇ was purified and prepared at 5 mg/ml in sodium phosphate buffer, pH 6.8, treated with a 5× molar excess of sodium metaperiodate for 8 minutes and then quenched with a 10× molar excess of glycerol. The N-terminal oxidized protein was desalted and concentrated to 5 mg/ml in HEPES, pH 8. This protein was reacted with the aminooxy reagent 1-(Aminooxy)-4-[(3-nitro-2-pyridyl)dithio]butane (Bioconjugate Chem. 1990, 7, 96-99). This was performed in the presence of a mild reducing agent (sodium cyanoborohydride), which is optional. Following purification, the product produced was a protein with the N-terminus modified with a thiol-pyridyl disulfide. The disulfide was reduced with a low concentration of a reducing agent (DTT, although any would suffice) to prepare CRM₁₉₇ with a single terminal thiol. The monothiol protein was reacted with an antigen containing a suitable electrophilic group, namely a polysaccharide functionalized with maleimide groups.

The monothiol was modified by reaction with thiol-reactive reagents. The thiol was converted to a maleimide group by the addition of an excess of a di-maleimide reagent. The N-terminal aldehyde was labeled with a useful group (e.g., click reagents, biotin, fluorescent compounds, maleimides, bromoacetyl groups and many more). As such, it can be reacted with hydrazide, hydrazine, aminooxy groups or other reagents known to selectively react with aldehydes. The protein with an N-terminal aldehyde can be reacted with a suitably modified protein, for example a hydrazide-derivatized protein, polymer, polysaccharide, oligosaccharide or peptide. It may also be reacted with suitably modified surfaces, including but not limited to, beads, chromatography resins or membranes. A catalyst, such as an aniline or 4-aminophenylalanine, can be used to promote oxime formation (J. Pept. Sci. 2016; 22: 271-279). The oxime can be reduced using a reagent such as sodium cyanoborohydride. The reducing reagent may be present during the oximation reaction if it does not affect the aldehyde or may be added at the conclusion of the reaction. Hydrazide and hydrazine reagents may also be reacted with the oxidized N-terminal amino acid instead of aminooxy reagents.

Other embodiments and uses of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. All references cited herein, including all publications, U.S. and foreign patents and patent applications, are specifically and entirely incorporated by reference. The term comprising, wherever used, is intended to include the terms consisting and consisting essentially of. Furthermore, the terms comprising, including, containing and the like are not intended to be limiting. It is intended that the specification and examples be considered exemplary only with the true scope and spirit of the invention indicated by the following claims.

The invention claimed is:
1. A method of producing a protein containing one or more sulfide linkages comprising: expressing the protein from a recombinant cell containing a genome and an expression vector that encodes the protein sequence, wherein the recombinant cell has a reduced activity of one or more disulfide reductase enzymes, and the 5′ terminus of the protein contains a methionine; expressing a peptidase from a gene of the recombinant cell, wherein the peptidase removes the methionine from the 5′ terminus of the protein expressed; and isolating the protein.
2. The method of claim 1, wherein the protein expressed comprises tetanus toxin, tetanus toxin heavy chain proteins, diphtheria toxoid, tetanus toxoid, Pseudomonas exoprotein A, Pseudomonas aeruginosa toxoid, Bordetella pertussis toxoid, Clostridium perfringens toxoid, Escherichia coli (E. coli) heat-labile toxin B subunit, Neisseria meningitidis outer membrane complex, Hemophilus influenzae protein D, Flagellin Fli C, or Horseshoe crab Haemocyanin.
3. The method of claim 1, wherein the recombinant cell has a reduced activity of only one disulfide reductase enzyme.
4. The method of claim 1, wherein the reduced activity of more than one disulfide reductase enzymes.
5. The method of claim 1, wherein the recombinant cell is an E. coli cell or a derivative or strain of E. coli.
6. The method of claim 1, wherein the peptidase comprises a methionine aminopeptidase.
7. The method of claim 1, wherein the expression vector contains a ribosome binding site, an initiation codon, and/or an expression enhancer region.
8. The method of claim 1, wherein the expression vector contains an inducible first promoter and expressing the protein comprises inducing the inducible first promoter with a first inducing agent.
9. The method of claim 1, wherein the gene contains an inducible second promoter and expressing the peptidase comprises inducing the inducible second promoter with a second inducing agent.
10. The method of claim 1, wherein the expression vector contains an inducible first promoter and expressing the protein comprises inducing the inducible first promoter with a first inducing agent, the gene contains an inducible second promoter and expressing the peptidase comprises inducing the inducible second promoter with a second inducing agent, and the first inducing agent and the second inducing agent are the same.
11. The method of claim 1, wherein the peptidase gene is integrated into the genome of the recombinant cell.
12. The method of claim 1, wherein isolating comprises chromatography.
13. The method of claim 12, wherein the chromatography comprises a sulfate resin, a gel resin, an active sulfated resin, a phosphate resin, a heparin resin or a heparin-like resin.
14. The method of claim 1, further comprising conjugating or coupling the isolated protein with a chemical compound.
15. The method of claim 14, wherein the chemical compound comprises a polysaccharide, a polymer, a polyethylene glycol, a derivative of polyethylene glycol, a peptide, an antibody or portion of an antibody, a lipid, a fatty acid, or a combination thereof.